fix(python): add backward compatibility for SandboxClaim status.sandbox.Name#515
Conversation
✅ Deploy Preview for agent-sandbox canceled.
|
|
Welcome @kincoy! |
|
Hi @kincoy. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
bd3b4e2 to
3d7bd64
Compare
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kincoy, SHRUTI6991 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Running the vitest integration suite against a real kind cluster (LANGCHAIN_SANDBOX_TEMPLATE=df-standard, LANGCHAIN_NAMESPACE=darkfactory, KUBECONFIG=bin/KUBECONFIG) surfaced three bugs inherited from the original source. All 97 standard tests now pass after these fixes. 1. K8sClient: loadFromCluster silently broken outside a pod `@kubernetes/client-node`'s loadFromCluster() does not throw when called outside a pod; it returns a config whose cluster server URL is `https://undefined:undefined`, which then fails at request time with "Invalid URL". Detect the in-pod environment explicitly via the KUBERNETES_SERVICE_HOST env var (set by kubelet for every pod) and only call loadFromCluster() there. Outside a pod, use loadFromDefault() which properly honors $KUBECONFIG. 2. K8sClient: status.sandbox.name vs Name backcompat Older controllers serialize the resolved sandbox name as `Name` (capital N) because the Go struct tag was omitted. Newer controllers use `name`. The TS client only checked `.name`, so resolveSandboxName timed out against the 11-day-old controller in the test cluster. Read both field names — mirrors the Python SDK's own backcompat fix in kubernetes-sigs#515. 3. K8sAgentSandbox.create: ignored initialFiles option The deepagents-js shared sandbox standard test suite passes initialFiles: Record<string, string> to the createSandbox callback to seed files before the agent runs. K8sAgentSandbox.create() took the options and dropped initialFiles on the floor, causing 6 tests in the initialFiles describe block to fail. Add the field to K8sAgentSandboxCreateOptions and upload after initialize(), aborting (and deleting the claim) on any upload failure. Test results: - 86 unit tests pass (unchanged) - 97 vitest int tests pass against the real kind cluster: - sandbox lifecycle (3) - command execution (6) - file operations (6) - write (10) - read (12) - edit (12) - ls (3) - grep (11) - glob (11) - initialFiles (7) — all now passing after fix kubernetes-sigs#3 - integration workflows (2) - K8sAgentSandbox provider-specific (5)
Ports the K8sAgentSandbox provider from
langchain-ai/deepagentsjs (feat/k8s-agent-sandbox branch) into this
repo as clients/typescript/langchain-agent-sandbox, mirroring the
Python clients/python/langchain-agent-sandbox package. Upgraded to
deepagents >=1.9 (the new V2 BackendProtocol) and integrated into
the repo's kind e2e suite. npm package name: langchain-agent-sandbox
(unscoped, matches the Python sibling).
## Package contents
clients/typescript/langchain-agent-sandbox/:
- src/sandbox.ts K8sAgentSandbox extends BaseSandbox with
execute / uploadFiles / downloadFiles / id +
static factories fromUrl / create / fromExisting /
deleteAll. File operations (ls / read / readRaw /
grep / glob / write / edit) are inherited from
the V2 BaseSandbox defaults that delegate to
execute() with POSIX shell.
- src/http-client.ts SandboxRouterClient — HTTP transport layer that
talks to the sandbox-router-svc with
X-Sandbox-ID / X-Sandbox-Namespace / X-Sandbox-Port
headers. Same wire protocol as the Python and
Go SDKs.
- src/k8s-client.ts K8sClient — thin wrapper around
@kubernetes/client-node for SandboxClaim CRUD,
resolveSandboxName, waitForSandboxReady, Gateway
IP discovery, and label-selector-filtered listing.
- src/connection.ts Three connection strategies: DirectConnectionStrategy,
GatewayConnectionStrategy, TunnelConnectionStrategy
(kubectl port-forward subprocess).
- src/types.ts Connection configs, K8sAgentSandboxOptions,
K8sAgentSandboxCreateOptions (including
initialFiles), K8sAgentSandboxError class with
typed code union including COMMAND_TIMEOUT.
- src/index.ts Public exports.
- src/*.test.ts 4 unit test files, 102 tests, fully mocked.
- src/sandbox.int.test.ts Integration suite wired to
@langchain/sandbox-standard-tests (97
behavior tests covering sandbox lifecycle,
command execution, file operations, write,
read, edit, ls, grep, glob, initialFiles,
and integration workflows).
- package.json, tsconfig.json, tsdown.config.ts, vitest.config.ts,
.gitignore, .npmignore, README.md, pnpm-lock.yaml.
examples/langchain-deepagentsjs/: minimal end-to-end example with
main.ts (Anthropic + deepagents + K8sAgentSandbox), package.json,
README.md, sandbox-template.yaml, .gitignore.
## Repo integration
- Makefile: new `test-langchainjs` target runs
`pnpm install --frozen-lockfile && pnpm typecheck && pnpm test`.
- .github/workflows/test-langchainjs.yml: PR-time unit test CI on
Node 20 + 22 matrix when clients/typescript/langchain-agent-sandbox
changes. Matches the existing minimal-permissions pattern.
- dev/tools/test-e2e: setup_typescript_sdk installs the package if
pnpm is present (hard fails on `pnpm install --frozen-lockfile`
failure to prevent stale-node_modules silent passes);
run_typescript_e2e_tests runs vitest --mode int with
KUBECONFIG=bin/KUBECONFIG and outputs bin/e2e-typescript-junit.xml.
Both functions short-circuit cleanly when pnpm, the package dir, or
LANGCHAIN_SANDBOX_TEMPLATE is missing, so they never block the Go
and Python suites from running.
- dev/ci/presubmits/test-e2e: copies the new junit artifact into
$ARTIFACTS when present.
- test/e2e/README.md: documents the shared sandbox-router dependency
for all three SDKs (Python, Go, TS) as a first-class deployment
prerequisite, with the full build + IMAGE_PLACEHOLDER substitution
+ kubectl apply workflow.
- Root README.md: TypeScript SDK section after the Python SDK
section.
## deepagents-js V2 protocol
The existing upstream code extended BaseSandbox (V1) and implemented
only the four abstract members (execute, uploadFiles, downloadFiles,
id). Against deepagents@^1.9.0 those four members satisfy the V2
SandboxBackendProtocolV2 contract unchanged, and the file operations
inherit from the V2 BaseSandbox defaults. No method rewrites were
required — only targeted hardening (see below).
## Post-port hardening
Eight bugs / gaps surfaced during the migration were fixed in this
same commit:
1. execute() COMMAND_TIMEOUT detection: the original port wrapped
every fetch() rejection in http-client.ts as CONNECTION_FAILED,
including the DOMException raised by AbortSignal.timeout. The new
http-client.ts lets errors named TimeoutError / AbortError
propagate without wrapping so execute()'s catch block can
distinguish them and emit a typed COMMAND_TIMEOUT error.
2. K8sClient constructor: @kubernetes/client-node's loadFromCluster()
does not throw outside a pod — it silently creates a config with
server URL https://undefined:undefined, which crashes at request
time. Fixed by checking process.env.KUBERNETES_SERVICE_HOST to
select loadFromCluster() vs loadFromDefault() explicitly. The
in-pod path still works unchanged.
3. resolveSandboxName: older controllers (pre upstream PR kubernetes-sigs#440)
serialize the field as status.sandbox.Name (capital N) because
the Go JSON tag was explicitly `json:"Name,omitempty"`. PR kubernetes-sigs#440
renamed it to `json:"name,omitempty"`. The extract callback now
reads both field names with lowercase precedence, mirroring the
Python SDK's backcompat shim in upstream PR kubernetes-sigs#515.
4. create() initialFiles: the deepagents-js shared standard test
suite passes `initialFiles: Record<string, string | Uint8Array>`
in createSandbox options to pre-populate files. Added the field
to K8sAgentSandboxCreateOptions and upload-after-initialize
handling with full cleanup on failure — on any upload error,
sandbox.close() is called first to tear down the tunnel subprocess
(preventing a `kubectl port-forward` leak), then the claim is
deleted.
5. listSandboxClaims + deleteAll label filtering: the original port
accepted a labels argument but silently ignored it, deleting
every claim in the namespace. listSandboxClaims now accepts an
optional labels record that is validated (RFC-1123 Kubernetes
label regex) and serialized into a labelSelector. Values
containing "," or "=" are rejected at serialization time so a
label like {owner: "alice,env=prod"} can't silently widen into a
two-predicate selector. deleteAll also refuses an empty labels
object unless the caller passes { confirmDeleteAll: true },
eliminating the original "delete everything" footgun by default.
6. execute() timeout distinction: new COMMAND_TIMEOUT error code
lets callers distinguish a per-command timeout from a generic
CONNECTION_FAILED / COMMAND_FAILED. Verified end-to-end against
a real kind cluster — `sleep 30` with defaultTimeout=3 terminates
at exactly 3005ms with the right typed error.
7. int test env-var gating: the LANGCHAIN_SANDBOX_TEMPLATE /
LANGCHAIN_NAMESPACE env var names match the Python langchain-
agent-sandbox e2e test for symmetry, with the legacy
SANDBOX_TEMPLATE / SANDBOX_NAMESPACE names kept as fallbacks.
sandboxStandardTests({ skip: !template, ... }) uses the shared
suite's built-in skip config so the run stays clean when the
template is unset.
8. docs: the sandbox-router Deployment manifest at
clients/python/agentic-sandbox-client/sandbox-router/sandbox_router.yaml
has a literal `image: IMAGE_PLACEHOLDER` line that requires
build + sed substitution before applying — applying it raw
produces a CrashLoopBackOff. Both READMEs now document the full
build + substitute + apply workflow and link to the router's own
README for configuration details.
## Verification
- 102 unit tests pass (`pnpm test`)
- 97 integration tests pass against a real kind cluster
(`LANGCHAIN_SANDBOX_TEMPLATE=df-standard
LANGCHAIN_NAMESPACE=darkfactory pnpm test:int`) — covers the
full deepagents-js shared standard suite (sandbox lifecycle,
command execution, file operations, write, read, edit, ls, grep,
glob, initialFiles, integration workflows) plus provider-specific
lifecycle tests.
- pnpm typecheck clean
- pnpm build produces dual ESM + CJS output
- make test-langchainjs clean
- All 14 .ts files carry the Apache-2.0 / Kubernetes Authors header
- No `any` introduced; #private fields used throughout
## Source attribution
Ported from langchain-ai/deepagentsjs feat/k8s-agent-sandbox branch
under Apache-2.0; the upstream source was MIT-licensed. All files
carry the kubernetes-sigs Apache-2.0 header.
Adds backward compatibility for
status.sandbox.namein the Python client.PR #440 renamed the field from
Nametonamebut the change isn't released yet. Older controllers still returnName, breaking things for users (#505).The code review bot in #440 actually recommended adding a fallback to check both fields, but it didn't make it into the final PR. This adds that fallback - it's just one line and lets the Python client work with any controller version.
Fixes #505